In this exploratory data analysis, our group examines dengue fever cases in the city of Mérida, Mexico, using data provided by Professor Clennon of the Emory Environmental Science Department. Dengue fever is a mosquito-borne viral infection caused by the dengue virus, which poses a significant public health concern in tropical and subtropical regions.
Common symptoms include high fever, nausea, and muscle pain, and severe cases can cause sudden drops in blood pressure. Mérida, the capital of the state of Yucatán, experiences recurrent dengue outbreaks, which researchers attribute to its tropical climate, seasonal rainfall and standing water, and dense urban population.
Our group is interested in public health, epidemiology, and the positive effect that well-developed data analysis can have on urban welfare. In part, this interest grew out of the COVID-19 pandemic and the widespread reporting on its pathology, treatment, and mitigation. Like COVID-19, dengue affects communities worldwide. However, because it spreads relatively slowly and occurs predominantly in tropical areas with underdeveloped healthcare infrastructure, it has been severely neglected compared to other dangerous diseases. This study is primarily an exercise in geospatial and temporal analysis of viral disease.
Our dengue dataset includes 540 rows, each representing a distinct region within Mérida. Each region is located by its coordinates in the Universal Transverse Mercator (UTM) system, a grid-based coordinate system that expresses positions in meters. The dataset initially had 8 columns: the X and Y UTM coordinates of each region, SP_ID (an identifier for each region), CVEGEO (a geographic identifier code used by the Mexican Geographic Agency), and the number of cases in each region for each of four years (2012-2015). As we explain in the Data Cleaning step, SP_ID and CVEGEO are not necessary for our analysis here. Additionally, we have included complementary data on ovitrap egg density (the quantity of mosquito eggs recorded for 4177 ovitraps distributed throughout the city). Although it is not the main focus of our analysis, it offers a starting point for examining the relationship between mosquito breeding sites and dengue incidence.
Before developing any visualizations or analytical models, we prepared the R environment by loading our primary dataset, “Merida_Den12_13_14_15.csv,” along with the R packages we needed. We used the tidyverse package, which provides core tools like dplyr for data transformation and ggplot2 for static visualization. For our more advanced interactive plots, we loaded plotly (used for the animated density map) and leaflet (used for the interactive Year-over-Year map). We also included sf to read and handle the simple features (geospatial) data for Mérida, along with htmltools and scales, which we use to build map popups and format percentages. Once all libraries were loaded, we imported the dataset into a data frame. The resulting data frame contained 540 rows and 8 columns, representing spatial identifiers, coordinate values, and annual dengue case counts. This formed the foundation for our exploratory analysis.
Data cleaning is a crucial step for ensuring the integrity and relevance of the data before analysis. We want all the data required for this project to be clean, validated, and standardized so that our group can build visualizations from a shared data frame.
The raw dataset contains several columns, including SP_ID and CVEGEO, which serve as identifiers for each region. After discussing the goals for the project, we determined that our exploratory analysis is focused exclusively on the relationship between spatial coordinates and case counts. Therefore, the important features for this analysis are the X and Y coordinates and the annual case count columns (Den2012 through Den2015). The SP_ID and CVEGEO columns, while potentially useful for future work like joining with demographic or climate data, are not required for our current visualization and modeling goals. We therefore dropped these two columns, creating a data frame named data_cleaned.
Next, we checked the data frame for missing data because missing (NA) values can introduce significant bias, propagate errors in calculations, and cause visualization failures. The R console output confirms that every column contains 0 missing values. This finding validates the completeness of our dataset, so we can proceed without further cleaning and base our results on the full set of observations.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.4.3
## Warning: package 'tidyr' was built under R version 4.4.3
## Warning: package 'readr' was built under R version 4.4.3
## Warning: package 'purrr' was built under R version 4.4.3
## Warning: package 'dplyr' was built under R version 4.4.3
## Warning: package 'forcats' was built under R version 4.4.3
## Warning: package 'lubridate' was built under R version 4.4.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.1 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(ggplot2)
library(plotly)
## Warning: package 'plotly' was built under R version 4.4.3
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
library(leaflet)
## Warning: package 'leaflet' was built under R version 4.4.3
library(sf)
## Warning: package 'sf' was built under R version 4.4.3
## Linking to GEOS 3.13.0, GDAL 3.10.1, PROJ 9.5.1; sf_use_s2() is TRUE
library(htmltools)
library(scales)
##
## Attaching package: 'scales'
##
## The following object is masked from 'package:purrr':
##
## discard
##
## The following object is masked from 'package:readr':
##
## col_factor
data <- read.csv('Merida_Den12_13_14_15.csv')
head(data)
## X Y SP_ID CVEGEO Den2012 Den2013 Den2014 Den2015
## 1 231639.6 2317510 0 3104100010070 17 6 2 2
## 2 231323.1 2316622 1 3104100010085 35 15 10 18
## 3 230769.7 2316113 2 310410001009A 35 14 8 17
## 4 233250.8 2316208 3 3104100010136 40 8 6 10
## 5 233266.3 2317442 4 3104100010140 12 8 4 4
## 6 232736.6 2317238 5 3104100010155 30 8 1 5
data_cleaned <- data %>%
select(X, Y, starts_with("Den"))
na_counts <- colSums(is.na(data_cleaned))
print(na_counts)
## X Y Den2012 Den2013 Den2014 Den2015
## 0 0 0 0 0 0
head(data_cleaned)
## X Y Den2012 Den2013 Den2014 Den2015
## 1 231639.6 2317510 17 6 2 2
## 2 231323.1 2316622 35 15 10 18
## 3 230769.7 2316113 35 14 8 17
## 4 233250.8 2316208 40 8 6 10
## 5 233266.3 2317442 12 8 4 4
## 6 232736.6 2317238 30 8 1 5
### Reorganize the dataset so each year's dengue fever count is stored as a separate row
data_formatted <- data %>%
pivot_longer(
cols = starts_with("Den"),
names_to = "Year",
values_to = "Count"
) %>%
mutate(Year = as.numeric(gsub("Den", "", Year)))
### Summarize the total dengue fever cases per year and calculate Year-over-Year percent change
data_summary <- data_formatted %>%
group_by(Year) %>%
summarise(
total_dengue = sum(Count, na.rm = TRUE))%>%
arrange(Year) %>%
mutate(
percent_change = (total_dengue - lag(total_dengue))/lag(total_dengue) * 100
)
### Visualize total dengue fever cases per year and Year-over-Year percent change
ggplot(data_summary, aes(x = Year)) +
# percent_change is multiplied by 100 so the line is visible on the case-count axis;
# the secondary axis divides by 100 to display the original percent values
geom_line(aes(y = percent_change * 100), color = "red", linewidth = 1.2) +
scale_y_continuous(
name = "Total Dengue Cases",
sec.axis = sec_axis(~./100, name = "% Change from Previous Year")
) +
geom_col(aes(y=total_dengue), fill = "blue")+
theme_minimal(base_size = 12) +
labs(
title = "Total Dengue Cases and Year-over-Year % Change in Merida",
x = "Year"
)
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
For diseases, outbreaks, and other public health concerns, it is crucial to monitor the volume of cases to understand whether an outbreak is being contained. Starting as simply as possible, we created a first graph that summarizes the number of dengue fever cases in Mérida from 2012 to 2015. We grouped the case counts by year and summed them across all regions. The graph shows that 2012 had the highest volume, at almost 10,000 cases, followed by a significant drop in 2013. Cases declined again in 2014 before rising sharply in 2015 to around 7,500.
Although the fluctuations in case volume are visible from the bars alone, our group overlaid a year-over-year percent change line on the same chart. Percent change tracks the rate of increase or decrease in dengue fever cases, so it shows not only whether the disease is spreading but how quickly. Drawn with geom_line, the red line helps our audience read the relative change between consecutive years. A sharply negative percent change, as in 2013 and 2014, could indicate successful containment, while a sharp positive change could signal a public health crisis.
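As a small worked illustration of the percent-change formula, the sketch below applies it in base R to only the six regions printed in the head(data_cleaned) output. The totals are therefore sample totals, not citywide figures, and the names sample_cases, totals, and summary_table are illustrative rather than part of our analysis code.

```r
# First six regions only, copied from the head(data_cleaned) output.
# These are NOT the citywide totals.
sample_cases <- data.frame(
  Den2012 = c(17, 35, 35, 40, 12, 30),
  Den2013 = c(6, 15, 14, 8, 8, 8),
  Den2014 = c(2, 10, 8, 6, 4, 1),
  Den2015 = c(2, 18, 17, 10, 4, 5)
)

# Total cases per year (column sums), named by year
totals <- colSums(sample_cases)
names(totals) <- sub("Den", "", names(totals))

# Year-over-year percent change: (current - previous) / previous * 100.
# The first year has no predecessor, so its change is NA.
previous <- c(NA, head(totals, -1))
percent_change <- (totals - previous) / previous * 100

summary_table <- data.frame(
  Year = as.integer(names(totals)),
  total_dengue = as.numeric(totals),
  percent_change = round(as.numeric(percent_change), 1)
)
print(summary_table)
```

For this six-region sample, the changes come out to about -65.1%, -47.5%, and +80.6%, with no value for 2012 because it has no preceding year.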
This visualization is relevant to public health tracking and surveillance. Although it is a general statistical summary, it can help track how quickly dengue fever is spreading in Mérida and assess the effectiveness of interventions in particular years, provided that confounding variables are controlled for. Seasonal changes and weather variation, for example, could affect these findings. Ultimately, data tracking leads to action, and in the case of dengue fever, it can lead to mosquito control initiatives or vaccination campaigns.
This initial analysis provides a strong foundation for conducting a geospatial analysis on where these dengue fever cases are specifically occurring.
Before we further analyze dengue incidence over time, let’s briefly look at some data for mosquito ovitraps. Ovitraps allow researchers to estimate the number of mosquitoes in an area by replicating breeding environments and measuring egg count. As dengue is a mosquito-borne illness, we should keep in mind the relative density of mosquito eggs as we look at disease hot spots later on. For the purposes of this analysis, we will not run statistical tests examining the exact association between estimated mosquito density and dengue incidence, as we intend to focus more on geospatial and temporal analytics.
map <- st_read("Merida_S25/Merida_Den12_13_14_15/Merida_Den12_13_14_15.shp")
## Reading layer `Merida_Den12_13_14_15' from data source
## `C:\Users\pompo\OneDrive\Documents\GitHub\dengueAnalysisTechWriting\Merida_S25\Merida_Den12_13_14_15\Merida_Den12_13_14_15.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 540 features and 6 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 212287.2 ymin: 2309495 xmax: 236386 ymax: 2333181
## Projected CRS: Mexico ITRF2008 / UTM zone 16N
ovitraps <- st_read("Merida_S25/Ovitraps_Sum/Ovitraps_Sum.shp")
## Reading layer `Ovitraps_Sum' from data source
## `C:\Users\pompo\OneDrive\Documents\GitHub\dengueAnalysisTechWriting\Merida_S25\Ovitraps_Sum\Ovitraps_Sum.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 4177 features and 50 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 219360.9 ymin: 2312502 xmax: 233659.3 ymax: 2329610
## Projected CRS: Mexico ITRF2008 / UTM zone 16N
# Create the plot with a gradient from yellow to red, representing relatively low to relatively high egg density. Overlay the plot on a blank regional map of Merida.
ggplot() +
geom_sf(data = map, color = "black", linewidth = 0.3) + # Create map outline
geom_sf(data = ovitraps, aes(color = Total), size = 2, alpha = 0.9) + # Generate ovitrap points
scale_color_gradient(low = "yellow", high = "red", name = "Eggs per Trap") + # Add color gradient to represent eggs per trap
labs(title = "Ovitrap Egg Counts in Merida") # Add title
# Reshape dataframe from wide to long format, each year becomes a separate row with 'Cases' column
data_long <- data_cleaned %>%
pivot_longer(
cols = starts_with("Den"), # select all year columns
names_to = "Year", # new column for year
names_prefix = "Den", # remove 'Den' prefix
values_to = "Cases" # new column for case counts
) %>%
# Convert year column to integer type
mutate(Year = as.integer(Year))
# Create animated scatter plot of dengue cases
fig <- plot_ly(
data_long, # use long-format dataframe
x = ~X, # X coordinate
y = ~Y, # Y coordinate
frame = ~Year, # animate by year
type = 'scatter',
mode = 'markers',
color = ~Cases, # color by case count
colors = "Reds", # color scale
marker = list(
size = 10,
line = list(width = 0.3, color = 'black') # add thin black border
),
# Add hover text with detailed info
text = ~paste(
"Year:", Year,
"<br>Cases:", Cases,
"<br>X:", round(X, 2),
"<br>Y:", round(Y, 2)
),
hoverinfo = 'text' # display hover text only
) %>%
# Add color bar to indicate case counts
colorbar(title = "Case Count") %>%
# Update layout, axis titles, and background colors
layout(
title = 'Dengue Fever Density in Merida (2012–2015)',
xaxis = list(title = 'X Coordinate'),
yaxis = list(title = 'Y Coordinate'),
plot_bgcolor = "white",
paper_bgcolor = "white"
) %>%
# Set animation speed and transition style
animation_opts(
frame = 3000, # 3 seconds per frame
transition = 800, # 0.8 second transition
easing = "linear" # linear transition
) %>%
# Add animation slider for selecting year
animation_slider(
currentvalue = list(
prefix = "Year: ", # label prefix
font = list(color = "red")
)
)
# Display the interactive animated plot
fig
We can further analyze the evolution of dengue in the Mérida region over time with an animated density chart. Using plotly, we show regions with high case counts in red and regions with low case counts in white, and the animation lets us follow the spread of the disease across the 2012-2015 timespan.
In the 2012 frame, we can see a significant outbreak, with two primary hotspot regions whose case counts exceed 100. It is important to note that these hotspots are not isolated. They are surrounded by locations reporting nearly 50 cases, indicating a concentrated and widespread initial event.
If we advance to 2013, we can see a surprising city-wide reduction in dengue incidence. There are far fewer cases overall, with no single region in Mérida reporting over 100 cases and only a handful approaching 50 cases. Overall, the map is dominated by low-incidence regions, suggesting the outbreak has died down. This trend continues into 2014, which shows an even more pronounced decline as the vast majority of regions report zero cases, and the most significant hotspot contains only around 30 cases.
However, when analyzing 2015, we can see a powerful resurgence. While many areas remain low-incidence, two new, intense hotspots emerge with case counts approaching 150. Additionally, the space between these two hotspots contains a dense, concentrated cluster of locations reporting nearly 100 cases each.
To sum up, this visualization suggests that disease spread in this urban environment is not random, but rather persistent and spatially clustered in specific, identifiable high-risk zones. The 2015 outbreak also suggests that new high-incidence areas tend to appear adjacent to or within previous outbreak zones, which is consistent with the geospatial nature of disease spread.
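The hotspot counts above are read visually from the animated chart. One way to check them numerically would be to compute, for each year, the largest single-region count and the number of regions at or above chosen thresholds. The sketch below does this in base R for only the six regions shown in the head(data_cleaned) output. The helper hotspot_summary and the thresholds of 10 and 30 are hypothetical choices for illustration.

```r
# First six regions only, copied from the head(data_cleaned) output
sample_cases <- data.frame(
  Den2012 = c(17, 35, 35, 40, 12, 30),
  Den2013 = c(6, 15, 14, 8, 8, 8),
  Den2014 = c(2, 10, 8, 6, 4, 1),
  Den2015 = c(2, 18, 17, 10, 4, 5)
)

# Summarize one year's counts: the largest single-region count and the
# number of regions at or above each threshold (hypothetical helper)
hotspot_summary <- function(cases, thresholds = c(50, 100)) {
  counts <- sapply(thresholds, function(t) sum(cases >= t, na.rm = TRUE))
  names(counts) <- paste0("n_ge_", thresholds)
  c(max_cases = max(cases, na.rm = TRUE), counts)
}

# Apply the helper to every year column; after transposing, rows are years
hotspots_by_year <- t(sapply(sample_cases, hotspot_summary, thresholds = c(10, 30)))
print(hotspots_by_year)
```

Running the same helper on the four case-count columns of the full data_cleaned data frame, with thresholds of 50 and 100, would test the year-by-year descriptions directly.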
# 1. Convert data to sf and transform to WGS84 for Leaflet
# We use EPSG:6371 (Mexico ITRF2008 / UTM zone 16N), the projected CRS reported for the shapefile
data_sf <- st_as_sf(data, coords = c("X", "Y"), crs = 6371)
data_sf_wgs84 <- st_transform(data_sf, crs = 4326)
# 2. Load and transform the neighborhood shapefile
town_map_raw <- st_read("Merida_S25/Merida_Den12_13_14_15/Merida_Den12_13_14_15.shp")
## Reading layer `Merida_Den12_13_14_15' from data source
## `C:\Users\pompo\OneDrive\Documents\GitHub\dengueAnalysisTechWriting\Merida_S25\Merida_Den12_13_14_15\Merida_Den12_13_14_15.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 540 features and 6 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 212287.2 ymin: 2309495 xmax: 236386 ymax: 2333181
## Projected CRS: Mexico ITRF2008 / UTM zone 16N
town_map_wgs84 <- st_transform(town_map_raw, crs = 4326)
# 3. Create a sequential color palette for absolute cases
# We find the global max to keep the scale consistent across years
all_cases <- c(data_sf_wgs84$Den2012, data_sf_wgs84$Den2013, data_sf_wgs84$Den2014, data_sf_wgs84$Den2015)
global_max_cases <- max(all_cases, na.rm = TRUE)
pal_cases <- colorNumeric(
palette = "Reds",
domain = c(0, global_max_cases),
na.color = "#a0a0a0"
)
# 4. Create popups for each year
popup_2012 <- ~lapply(sprintf("<strong>ID: %s</strong><br/>2012 Cases: %d", SP_ID, Den2012), HTML)
popup_2013 <- ~lapply(sprintf("<strong>ID: %s</strong><br/>2013 Cases: %d", SP_ID, Den2013), HTML)
popup_2014 <- ~lapply(sprintf("<strong>ID: %s</strong><br/>2014 Cases: %d", SP_ID, Den2014), HTML)
popup_2015 <- ~lapply(sprintf("<strong>ID: %s</strong><br/>2015 Cases: %d", SP_ID, Den2015), HTML)
# 5. Create the interactive leaflet map
m_cases <- leaflet(data_sf_wgs84) %>%
# Add base map layers
addProviderTiles(providers$CartoDB.Positron, group = "Light Map") %>%
addProviderTiles(providers$Esri.WorldImagery, group = "Satellite") %>%
# Add Neighborhoods layer
addPolygons(
data = town_map_wgs84,
group = "Neighborhoods",
fillOpacity = 0.1,
color = "#444",
weight = 1,
opacity = 0.7,
popup = ~SP_ID
) %>%
# Set the map to focus on the data
fitBounds(
~as.numeric(st_bbox(geometry)[1]),
~as.numeric(st_bbox(geometry)[2]),
~as.numeric(st_bbox(geometry)[3]),
~as.numeric(st_bbox(geometry)[4])
) %>%
# ==== Layer 1: 2012 Cases ====
addCircleMarkers(
group = "2012 Cases",
color = ~pal_cases(Den2012),
fillOpacity = 0.8,
stroke = FALSE,
radius = 5,
popup = popup_2012,
label = ~as.character(Den2012)
) %>%
# ==== Layer 2: 2013 Cases ====
addCircleMarkers(
group = "2013 Cases",
color = ~pal_cases(Den2013),
fillOpacity = 0.8,
stroke = FALSE,
radius = 5,
popup = popup_2013,
label = ~as.character(Den2013)
) %>%
# ==== Layer 3: 2014 Cases ====
addCircleMarkers(
group = "2014 Cases",
color = ~pal_cases(Den2014),
fillOpacity = 0.8,
stroke = FALSE,
radius = 5,
popup = popup_2014,
label = ~as.character(Den2014)
) %>%
# ==== Layer 4: 2015 Cases ====
addCircleMarkers(
group = "2015 Cases",
color = ~pal_cases(Den2015),
fillOpacity = 0.8,
stroke = FALSE,
radius = 5,
popup = popup_2015,
label = ~as.character(Den2015)
) %>%
# ==== Legend ====
addLegend(
pal = pal_cases,
values = c(0, global_max_cases),
title = "Case Count",
position = "bottomright",
opacity = 1
) %>%
# ==== Layer Control ====
addLayersControl(
baseGroups = c("2012 Cases", "2013 Cases", "2014 Cases", "2015 Cases"),
overlayGroups = c("Light Map", "Satellite", "Neighborhoods"),
options = layersControlOptions(collapsed = FALSE)
)
# Display the map
m_cases
# 1. Calculate Year-over-Year (YoY) percentage change
# We must handle cases where the denominator (previous year) is 0.
calculate_yoy <- function(current, previous) {
case_when(
previous == 0 & current == 0 ~ 0, # 0 to 0 is 0% change
previous == 0 & current > 0 ~ Inf, # 0 to N is infinite % change
previous > 0 ~ (current - previous) / previous
)
}
data_yoy <- data %>%
mutate(
YoY_2013 = calculate_yoy(Den2013, Den2012),
YoY_2014 = calculate_yoy(Den2014, Den2013),
YoY_2015 = calculate_yoy(Den2015, Den2014)
)
# 2. Clean up Inf/-Inf/NaN values for visualization. We'll set Inf to NA.
data_yoy <- data_yoy %>%
mutate(across(starts_with("YoY_"), ~if_else(is.finite(.), ., NA_real_)))
# 3. Convert to a geospatial 'sf' object and transform coordinates
# The .prj file (EPSG:6371) tells us this is Mexico ITRF2008 / UTM zone 16N.
# Leaflet requires standard WGS84 (EPSG:4326) coordinates (lat/lon).
data_sf <- st_as_sf(data_yoy, coords = c("X", "Y"), crs = 6371)
data_sf_wgs84 <- st_transform(data_sf, crs = 4326)
# 4. Define a diverging color palette
# We limit the domain to -1 (-100%) through 2 (200%); values outside it are not
# clamped but treated as NA and drawn in the na.color gray
cap_range <- c(-1, 2)
pal_yoy <- colorNumeric(
palette = "RdYlBu",
domain = cap_range,
na.color = "#a0a0a0",
reverse = TRUE # Make red positive (increase), blue negative (decrease)
)
# 5. Create formatted popup and label content
# Popups appear on click
popup_2013 <- ~lapply(sprintf(
"<strong>Location ID: %s</strong><br/>2012 Cases: %d<br/>2013 Cases: %d<br/>YoY Change: %s",
SP_ID, Den2012, Den2013, scales::percent(YoY_2013, accuracy = 0.1)
), HTML)
popup_2014 <- ~lapply(sprintf(
"<strong>Location ID: %s</strong><br/>2013 Cases: %d<br/>2014 Cases: %d<br/>YoY Change: %s",
SP_ID, Den2013, Den2014, scales::percent(YoY_2014, accuracy = 0.1)
), HTML)
popup_2015 <- ~lapply(sprintf(
"<strong>Location ID: %s</strong><br/>2014 Cases: %d<br/>2015 Cases: %d<br/>YoY Change: %s",
SP_ID, Den2014, Den2015, scales::percent(YoY_2015, accuracy = 0.1)
), HTML)
# Labels appear on hover
label_2013 <- ~scales::percent(YoY_2013, accuracy = 0.1)
label_2014 <- ~scales::percent(YoY_2014, accuracy = 0.1)
label_2015 <- ~scales::percent(YoY_2015, accuracy = 0.1)
# Load and transform the neighborhood shapefile
town_map_raw <- st_read("Merida_S25/Merida_Den12_13_14_15/Merida_Den12_13_14_15.shp")
## Reading layer `Merida_Den12_13_14_15' from data source
## `C:\Users\pompo\OneDrive\Documents\GitHub\dengueAnalysisTechWriting\Merida_S25\Merida_Den12_13_14_15\Merida_Den12_13_14_15.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 540 features and 6 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 212287.2 ymin: 2309495 xmax: 236386 ymax: 2333181
## Projected CRS: Mexico ITRF2008 / UTM zone 16N
town_map_wgs84 <- st_transform(town_map_raw, crs = 4326)
# Create the interactive leaflet map
m <- leaflet(data_sf_wgs84) %>%
# Add base map layers
addProviderTiles(providers$CartoDB.Positron, group = "Light Map") %>%
addProviderTiles(providers$Esri.WorldImagery, group = "Satellite") %>%
# Add Neighborhoods layer
addPolygons(
data = town_map_wgs84,
group = "Neighborhoods",
fillOpacity = 0.1,
color = "#444",
weight = 1,
opacity = 0.7,
popup = ~SP_ID
) %>%
# Set the map to focus on the data
fitBounds(
~as.numeric(st_bbox(geometry)[1]),
~as.numeric(st_bbox(geometry)[2]),
~as.numeric(st_bbox(geometry)[3]),
~as.numeric(st_bbox(geometry)[4])
) %>%
# ==== Layer 1: 2013 vs 2012 ====
addCircleMarkers(
group = "2013 vs 2012",
color = ~pal_yoy(YoY_2013),
fillOpacity = 0.8,
stroke = FALSE,
radius = 5,
popup = popup_2013,
label = label_2013
) %>%
# ==== Layer 2: 2014 vs 2013 ====
addCircleMarkers(
group = "2014 vs 2013",
color = ~pal_yoy(YoY_2014),
fillOpacity = 0.8,
stroke = FALSE,
radius = 5,
popup = popup_2014,
label = label_2014
) %>%
# ==== Layer 3: 2015 vs 2014 ====
addCircleMarkers(
group = "2015 vs 2014",
color = ~pal_yoy(YoY_2015),
fillOpacity = 0.8,
stroke = FALSE,
radius = 5,
popup = popup_2015,
label = label_2015
) %>%
# ==== Legend ====
addLegend(
pal = pal_yoy,
values = cap_range,
title = "YoY % Change",
position = "bottomright",
labFormat = labelFormat(suffix = "%", transform = function(x) x * 100),
opacity = 1
) %>%
# ==== Layer Control ====
addLayersControl(
baseGroups = c("2013 vs 2012", "2014 vs 2013", "2015 vs 2014"),
overlayGroups = c("Light Map", "Satellite", "Neighborhoods"),
options = layersControlOptions(collapsed = FALSE)
)
## Warning in pal_yoy(YoY_2013): Some values were outside the color scale and will
## be treated as NA
## Warning in pal_yoy(YoY_2013): Some values were outside the color scale and will
## be treated as NA
## Warning in pal_yoy(YoY_2014): Some values were outside the color scale and will
## be treated as NA
## Warning in pal_yoy(YoY_2014): Some values were outside the color scale and will
## be treated as NA
## Warning in pal_yoy(YoY_2015): Some values were outside the color scale and will
## be treated as NA
## Warning in pal_yoy(YoY_2015): Some values were outside the color scale and will
## be treated as NA
# Display the map
m
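The warnings above arise because colorNumeric() maps any value outside its domain to the NA color, so regions whose cases more than tripled are drawn in gray rather than deep red. If capping were preferred, one option is to clamp values to the domain before coloring. The following is a minimal base R sketch; clamp is a hypothetical helper, and the input vector is made up for illustration.

```r
# Clamp values into [lower, upper]; NA values stay NA (hypothetical helper)
clamp <- function(x, lower, upper) {
  pmin(pmax(x, lower), upper)
}

# Made-up year-over-year ratios, including one above the 200% limit and one NA
yoy <- c(-1, -0.5, 0, 1.5, 3.2, NA)

clamped <- clamp(yoy, lower = -1, upper = 2)
print(clamped)
```

In the map code this would mean passing, for example, ~pal_yoy(clamp(YoY_2013, -1, 2)) as the marker color, while the popups continue to show the uncapped percentage.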
Our exploratory analysis highlights the spatial and temporal patterns of dengue fever in Mérida from 2012 to 2015, revealing that the disease is not randomly distributed, with certain neighborhoods consistently experiencing higher case counts. The year-over-year trends and animated maps show both declines and resurgences, suggesting that dengue outbreaks are influenced by underlying environmental and/or societal factors.
For future work, the analysis could be extended to incorporate additional data to gain deeper insights into disease dynamics and etiology. We briefly outline some examples below.
Demographics: Examine the age, gender, socioeconomic status, education, and proximity to healthcare of affected populations.
Ovitrap data: Explore correlations between mosquito populations and case counts using statistical tests.
Environmental variables: Investigate the impact of vegetation density, proximity to still water, etc.
Population density: Assess how urban density influences the spread of dengue.
By integrating these factors, future studies could develop predictive models that better anticipate outbreaks and inform public health interventions, such as targeted mosquito control or community education campaigns.